Overview

Dataset statistics

Number of variables: 22
Number of observations: 1296675
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 217.6 MiB
Average record size in memory: 176.0 B
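
The memory figures can be cross-checked with a little arithmetic. The sketch below assumes, as a working hypothesis rather than a documented fact about the report, that every column stores one 8-byte slot per row (a 64-bit number or an object pointer), which would mean string contents are not counted.

```python
# Figures copied from the dataset statistics above.
n_rows = 1_296_675
n_columns = 22
total_mib = 217.6

MIB = 1024 ** 2  # bytes per mebibyte

# Average record size: total bytes divided by the number of rows.
avg_record_bytes = total_mib * MIB / n_rows

# Assumption: one 8-byte slot per value (float64/int64 or an object pointer).
bytes_per_slot = 8
per_column_mib = n_rows * bytes_per_slot / MIB
per_row_bytes = n_columns * bytes_per_slot

print(f"average record size: {avg_record_bytes:.1f} B")
print(f"per-column size:     {per_column_mib:.1f} MiB")
print(f"per-row size:        {per_row_bytes} B")
```

Under that assumption the per-column result agrees with the 9.9 MiB "Memory size" listed for each variable, and 22 slots of 8 bytes agree with the 176.0 B average record size.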

Variable types

Categorical: 13
Numeric: 9

Dataset

Description: This dataset concerns credit card fraud and includes around 21 features and 1 target.
URL

Alerts

trans_date_trans_time has a high cardinality: 1274791 distinct values High cardinality
merchant has a high cardinality: 693 distinct values High cardinality
first has a high cardinality: 352 distinct values High cardinality
last has a high cardinality: 481 distinct values High cardinality
street has a high cardinality: 983 distinct values High cardinality
city has a high cardinality: 894 distinct values High cardinality
state has a high cardinality: 51 distinct values High cardinality
job has a high cardinality: 494 distinct values High cardinality
dob has a high cardinality: 968 distinct values High cardinality
trans_num has a high cardinality: 1296675 distinct values High cardinality
zip is highly correlated with long and 1 other fieldsHigh correlation
lat is highly correlated with merch_latHigh correlation
long is highly correlated with zip and 1 other fieldsHigh correlation
merch_lat is highly correlated with latHigh correlation
merch_long is highly correlated with zip and 1 other fieldsHigh correlation
state is highly correlated with zip and 5 other fieldsHigh correlation
zip is highly correlated with state and 4 other fieldsHigh correlation
lat is highly correlated with state and 4 other fieldsHigh correlation
long is highly correlated with state and 4 other fieldsHigh correlation
city_pop is highly correlated with stateHigh correlation
merch_lat is highly correlated with state and 4 other fieldsHigh correlation
merch_long is highly correlated with state and 4 other fieldsHigh correlation
amt is highly skewed (γ1 = 42.27787379) Skewed
trans_date_trans_time is uniformly distributed Uniform
trans_num is uniformly distributed Uniform
trans_num has unique values Unique
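
The "Skewed" alert reports a skewness coefficient γ1. As a minimal sketch of what that quantity measures, the function below computes the moment-based form g1 = m3 / m2^1.5 using only the standard library. The report's own value comes from its underlying library and may use a bias-corrected estimator, so small differences are to be expected on small samples; the input lists are hypothetical.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical data: a symmetric sample and one with a single large value.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [1.0, 1.0, 1.0, 1.0, 100.0]

print(skewness(symmetric))     # close to 0 for symmetric data
print(skewness(right_tailed))  # positive: the long tail is on the right
```

A large positive value such as the one reported for amt indicates that a few very large amounts dominate the right tail.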

Reproduction

Analysis started: 2023-02-01 11:25:13.936968
Analysis finished: 2023-02-01 11:27:14.744551
Duration: 2 minutes and 0.81 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

trans_date_trans_time
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 1274791
Distinct (%): 98.3%
Missing: 0
Missing (%): 0.0%
Memory size: 9.9 MiB

2020-06-02 12:47:07       4
2020-06-01 01:37:47       4
2019-04-22 16:02:01       4
2019-12-01 19:39:27       3
2019-01-01 16:52:19       3
Other values (1274786)    1296657

Length

Max length19
Median length19
Mean length19
Min length19

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1253218
Unique (%): 96.6%
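
The "Distinct" and "Unique" counts differ for this variable. The sketch below assumes that "Distinct" counts the different values present and "Unique" counts the values that occur exactly once, which is consistent with Unique being the smaller figure here. The sample list is hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Hypothetical timestamps: one value is repeated.
sample = [
    "2019-01-01 00:00:18",
    "2019-01-01 00:00:44",
    "2019-01-01 00:00:44",
    "2019-01-01 00:00:51",
]
print(distinct_and_unique(sample))
```

In the sample, three different values appear but only two of them occur once, so the repeated timestamp counts toward "Distinct" and not toward "Unique".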

Sample

1st row2019-01-01 00:00:18
2nd row2019-01-01 00:00:44
3rd row2019-01-01 00:00:51
4th row2019-01-01 00:01:16
5th row2019-01-01 00:03:06
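
The sample rows follow the pattern YYYY-MM-DD HH:MM:SS, yet the variable is profiled as a categorical with very high cardinality. One possible way to make it more useful, shown here only as a sketch with a hypothetical helper name, is to parse the string and derive coarser time features.

```python
from datetime import datetime

# Format inferred from the sample rows above.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def time_features(raw):
    """Parse a timestamp string and return a few low-cardinality features."""
    ts = datetime.strptime(raw, TIMESTAMP_FORMAT)
    return {"hour": ts.hour, "weekday": ts.weekday(), "month": ts.month}  # Monday = 0


print(time_features("2019-01-01 00:00:18"))
```

Each derived feature has at most 24, 7, or 12 levels, which avoids the high-cardinality alert raised for the raw string.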

Common Values

Value                     Count      Frequency (%)
2020-06-02 12:47:07       4          < 0.1%
2020-06-01 01:37:47       4          < 0.1%
2019-04-22 16:02:01       4          < 0.1%
2019-12-01 19:39:27       3          < 0.1%
2019-01-01 16:52:19       3          < 0.1%
2019-12-30 15:25:56       3          < 0.1%
2020-05-29 18:21:24       3          < 0.1%
2019-12-31 21:33:30       3          < 0.1%
2019-11-18 23:03:49       3          < 0.1%
2019-07-21 14:05:37       3          < 0.1%
Other values (1274781)    1296642    > 99.9%

Length

Histogram of lengths of the category
Value                   Count      Frequency (%)
2019-12-08              6428       0.2%
2019-12-15              6425       0.2%
2019-12-22              6325       0.2%
2019-12-29              6320       0.2%
2019-12-01              6283       0.2%
2019-12-09              6252       0.2%
2019-12-02              6150       0.2%
2019-12-16              6127       0.2%
2019-12-30              6064       0.2%
2019-12-23              5937       0.2%
Other values (86927)    2531039    97.6%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

cc_num
Real number (ℝ≥0)

Distinct: 983
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.171920421 × 10^17
Minimum: 6.041620718 × 10^10
Maximum: 4.992346398 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.9 MiB

Quantile statistics

Minimum: 6.041620718 × 10^10
5-th percentile: 6.304848798 × 10^11
Q1: 1.800429465 × 10^14
median: 3.521417321 × 10^15
Q3: 4.642255475 × 10^15
95-th percentile: 4.497913966 × 10^18
Maximum: 4.992346398 × 10^18
Range: 4.992346338 × 10^18
Interquartile range (IQR): 4.462212529 × 10^15

Descriptive statistics

Standard deviation: 1.308806447 × 10^18
Coefficient of variation (CV): 3.1371798
Kurtosis: 6.179949935
Mean: 4.171920421 × 10^17
Median Absolute Deviation (MAD): 3.076470873 × 10^15
Skewness: 2.851879006
Sum: -6.725541877 × 10^18
Variance: 1.712974316 × 10^36
Monotonicity: Not monotonic
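
The reported Sum is negative even though the column has no negative values. One plausible explanation, assuming the card numbers were stored as signed 64-bit integers, is that the running total overflowed and wrapped around. The sketch below only illustrates that effect; the rounded mean cannot reproduce the exact reported sum.

```python
INT64_MAX = 2**63 - 1


def wrap_int64(value):
    """Reduce an exact integer to what a signed 64-bit accumulator would hold."""
    return (value + 2**63) % 2**64 - 2**63


# Figures copied from the report; the mean is rounded, so this is approximate.
n_rows = 1_296_675
mean = 4.171920421e17
approx_true_sum = round(mean * n_rows)

print(approx_true_sum > INT64_MAX)  # the true total does not fit in int64
print(wrap_int64(INT64_MAX + 1))    # one past the maximum wraps to the minimum
```

Since cc_num is an identifier rather than a quantity, its sum, mean, and variance carry no meaning in any case; treating the column as a string would avoid the issue.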
Histogram with fixed size bins (bins=50)
Value                   Count      Frequency (%)
5.713652351 × 10^11     3123       0.2%
4.512828415 × 10^18     3123       0.2%
3.672269902 × 10^13     3119       0.2%
2.131124026 × 10^14     3117       0.2%
3.54510934 × 10^15      3113       0.2%
6.534628261 × 10^15     3112       0.2%
6.011367958 × 10^15     3110       0.2%
2.720433096 × 10^15     3107       0.2%
6.011438889 × 10^15     3106       0.2%
6.011109737 × 10^15     3101       0.2%
Other values (973)      1265544    97.6%

Value                   Count      Frequency (%)
6.041620718 × 10^10     1518       0.1%
6.042292873 × 10^10     1531       0.1%
6.042309813 × 10^10     510        < 0.1%
6.042785159 × 10^10     528        < 0.1%
6.048700208 × 10^10     496        < 0.1%
6.049059630 × 10^10     1010       0.1%
6.049559311 × 10^10     518        < 0.1%
5.018029536 × 10^11     1559       0.1%
5.018181333 × 10^11     8          < 0.1%
5.018282048 × 10^11     515        < 0.1%

Value                   Count      Frequency (%)
4.992346398 × 10^18     2059       0.2%
4.989847571 × 10^18     1007       0.1%
4.980323468 × 10^18     532        < 0.1%
4.973530368 × 10^18     1040       0.1%
4.958589672 × 10^18     1476       0.1%
4.95682899 × 10^18      2566       0.2%
4.911818931 × 10^18     9          < 0.1%
4.906628656 × 10^18     2584       0.2%
4.897067971 × 10^18     1038       0.1%
4.890424427 × 10^18     1496       0.1%

merchant
Categorical

HIGH CARDINALITY

Distinct693
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size9.9 MiB
fraud_Kilback LLC
 
4403
fraud_Cormier LLC
 
3649
fraud_Schumm PLC
 
3634
fraud_Kuhn LLC
 
3510
fraud_Boyer PLC
 
3493
Other values (688)
1277986 

Length

Max length43
Median length20
Mean length23.13259683
Min length13

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowfraud_Rippin, Kub and Mann
2nd rowfraud_Heller, Gutmann and Zieme
3rd rowfraud_Lind-Buckridge
4th rowfraud_Kutch, Hermiston and Farrell
5th rowfraud_Keeling-Crist

Common Values

ValueCountFrequency (%)
fraud_Kilback LLC4403
 
0.3%
fraud_Cormier LLC3649
 
0.3%
fraud_Schumm PLC3634
 
0.3%
fraud_Kuhn LLC3510
 
0.3%
fraud_Boyer PLC3493
 
0.3%
fraud_Dickinson Ltd3434
 
0.3%
fraud_Cummerata-Jones2736
 
0.2%
fraud_Kutch LLC2734
 
0.2%
fraud_Olson, Becker and Koch2723
 
0.2%
fraud_Stroman, Hudson and Erdman2721
 
0.2%
Other values (683)1263638
97.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
and474111
 
15.7%
llc97780
 
3.2%
inc91939
 
3.0%
sons73145
 
2.4%
ltd70853
 
2.3%
plc66475
 
2.2%
group50447
 
1.7%
fraud_kutch10560
 
0.3%
fraud_schaefer9394
 
0.3%
fraud_streich9250
 
0.3%
Other values (804)2069403
68.4%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

category
Categorical

Distinct14
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size9.9 MiB
gas_transport
131659 
grocery_pos
123638 
home
123115 
shopping_pos
116672 
kids_pets
113035 
Other values (9)
688556 

Length

Max length14
Median length11
Mean length10.52607862
Min length4

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowmisc_net
2nd rowgrocery_pos
3rd rowentertainment
4th rowgas_transport
5th rowmisc_pos

Common Values

ValueCountFrequency (%)
gas_transport131659
10.2%
grocery_pos123638
9.5%
home123115
9.5%
shopping_pos116672
9.0%
kids_pets113035
8.7%
shopping_net97543
7.5%
entertainment94014
7.3%
food_dining91461
 
7.1%
personal_care90758
 
7.0%
health_fitness85879
 
6.6%
Other values (4)228901
17.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
gas_transport131659
10.2%
grocery_pos123638
9.5%
home123115
9.5%
shopping_pos116672
9.0%
kids_pets113035
8.7%
shopping_net97543
7.5%
entertainment94014
7.3%
food_dining91461
 
7.1%
personal_care90758
 
7.0%
health_fitness85879
 
6.6%
Other values (4)228901
17.7%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

amt
Real number (ℝ≥0)

SKEWED

Distinct52928
Distinct (%)4.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean70.35103546
Minimum1
Maximum28948.9
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size9.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2.44
Q1: 9.65
median: 47.52
Q3: 83.14
95-th percentile: 196.31
Maximum: 28948.9
Range: 28947.9
Interquartile range (IQR): 73.49
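
The interquartile range follows directly from the two quartiles. The short sketch below recomputes it and, as an illustration only, derives the 1.5 × IQR fences that are often used to flag outliers. The 1.5 multiplier is a common convention and not something this report applies.

```python
# Quartiles reported for amt in the quantile statistics.
q1, q3 = 9.65, 83.14

iqr = q3 - q1

# Assumption: Tukey-style fences with the conventional 1.5 multiplier.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"IQR:         {iqr:.2f}")
print(f"lower fence: {lower_fence:.3f}")
print(f"upper fence: {upper_fence:.3f}")
```

The upper fence lies below the reported 95-th percentile (196.31), so this rule would flag more than 5% of the transactions, which reflects how heavy the right tail of amt is.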

Descriptive statistics

Standard deviation160.3160386
Coefficient of variation (CV)2.278801407
Kurtosis4545.644979
Mean70.35103546
Median Absolute Deviation (MAD)37.5
Skewness42.27787379
Sum91222428.9
Variance25701.23222
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value                   Count      Frequency (%)
1.14                    542        < 0.1%
1.04                    538        < 0.1%
1.25                    535        < 0.1%
1.02                    533        < 0.1%
1.01                    523        < 0.1%
1.05                    519        < 0.1%
1.2                     516        < 0.1%
1.23                    515        < 0.1%
1.08                    512        < 0.1%
1.11                    509        < 0.1%
Other values (52918)    1291433    99.6%

Value                   Count      Frequency (%)
1                       222        < 0.1%
1.01                    523        < 0.1%
1.02                    533        < 0.1%
1.03                    499        < 0.1%
1.04                    538        < 0.1%
1.05                    519        < 0.1%
1.06                    471        < 0.1%
1.07                    498        < 0.1%
1.08                    512        < 0.1%
1.09                    496        < 0.1%

Value                   Count      Frequency (%)
28948.9                 1          < 0.1%
27390.12                1          < 0.1%
27119.77                1          < 0.1%
26544.12                1          < 0.1%
25086.94                1          < 0.1%
17897.24                1          < 0.1%
15305.95                1          < 0.1%
15047.03                1          < 0.1%
15034.18                1          < 0.1%
14849.74                1          < 0.1%

first
Categorical

HIGH CARDINALITY

Distinct352
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size9.9 MiB
Christopher
 
26669
Robert
 
21667
Jessica
 
20581
James
 
20039
Michael
 
20009
Other values (347)
1187710 

Length

Max length11
Median length6
Mean length6.080431874
Min length3

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowJennifer
2nd rowStephanie
3rd rowEdward
4th rowJeremy
5th rowTyler

Common Values

ValueCountFrequency (%)
Christopher26669
 
2.1%
Robert21667
 
1.7%
Jessica20581
 
1.6%
James20039
 
1.5%
Michael20009
 
1.5%
David19965
 
1.5%
Jennifer16940
 
1.3%
William16371
 
1.3%
Mary16346
 
1.3%
John16325
 
1.3%
Other values (342)1101763
85.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
christopher26669
 
2.1%
robert21667
 
1.7%
jessica20581
 
1.6%
james20039
 
1.5%
michael20009
 
1.5%
david19965
 
1.5%
jennifer16940
 
1.3%
william16371
 
1.3%
mary16346
 
1.3%
john16325
 
1.3%
Other values (342)1101763
85.0%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

last
Categorical

HIGH CARDINALITY

Distinct481
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size9.9 MiB
Smith
 
28794
Williams
 
23605
Davis
 
21910
Johnson
 
20034
Rodriguez
 
17394
Other values (476)
1184938 

Length

Max length11
Median length6
Mean length6.111177435
Min length2

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowBanks
2nd rowGill
3rd rowSanchez
4th rowWhite
5th rowGarcia

Common Values

ValueCountFrequency (%)
Smith28794
 
2.2%
Williams23605
 
1.8%
Davis21910
 
1.7%
Johnson20034
 
1.5%
Rodriguez17394
 
1.3%
Martinez14805
 
1.1%
Jones13976
 
1.1%
Lewis12753
 
1.0%
Gonzalez11799
 
0.9%
Miller11698
 
0.9%
Other values (471)1119907
86.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
smith28794
 
2.2%
williams23605
 
1.8%
davis21910
 
1.7%
johnson20034
 
1.5%
rodriguez17394
 
1.3%
martinez14805
 
1.1%
jones13976
 
1.1%
lewis12753
 
1.0%
gonzalez11799
 
0.9%
miller11698
 
0.9%
Other values (471)1119907
86.4%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

gender
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size9.9 MiB
F
709863 
M
586812 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowF
2nd rowF
3rd rowM
4th rowM
5th rowM

Common Values

ValueCountFrequency (%)
F709863
54.7%
M586812
45.3%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
f709863
54.7%
m586812
45.3%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

street
Categorical

HIGH CARDINALITY

Distinct983
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size9.9 MiB
0069 Robin Brooks Apt. 695
 
3123
864 Reynolds Plains
 
3123
8172 Robertson Parkways Suite 072
 
3119
4664 Sanchez Common Suite 930
 
3117
8030 Beck Motorway
 
3113
Other values (978)
1281080 

Length

Max length35
Median length22
Mean length22.22902655
Min length12

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row561 Perry Cove
2nd row43039 Riley Greens Suite 393
3rd row594 White Dale Suite 530
4th row9443 Cynthia Court Apt. 038
5th row408 Bradley Rest

Common Values

ValueCountFrequency (%)
0069 Robin Brooks Apt. 6953123
 
0.2%
864 Reynolds Plains3123
 
0.2%
8172 Robertson Parkways Suite 0723119
 
0.2%
4664 Sanchez Common Suite 9303117
 
0.2%
8030 Beck Motorway3113
 
0.2%
29606 Martinez Views Suite 6533112
 
0.2%
1652 James Mews3110
 
0.2%
854 Walker Dale Suite 4883107
 
0.2%
40624 Rebecca Spurs3106
 
0.2%
594 Berry Lights Apt. 3923101
 
0.2%
Other values (973)1265544
97.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
apt327791
 
6.4%
suite305467
 
5.9%
island22954
 
0.4%
michael18967
 
0.4%
common17978
 
0.3%
station17957
 
0.3%
islands17917
 
0.3%
david17476
 
0.3%
brooks16991
 
0.3%
fields16321
 
0.3%
Other values (1940)4376722
84.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

city
Categorical

HIGH CARDINALITY

Distinct894
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size9.9 MiB
Birmingham
 
5617
San Antonio
 
5130
Utica
 
5105
Phoenix
 
5075
Meridian
 
5060
Other values (889)
1270688 

Length

Max length25
Median length8
Mean length8.652245937
Min length3

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMoravian Falls
2nd rowOrient
3rd rowMalad City
4th rowBoulder
5th rowDoe Hill

Common Values

ValueCountFrequency (%)
Birmingham5617
 
0.4%
San Antonio5130
 
0.4%
Utica5105
 
0.4%
Phoenix5075
 
0.4%
Meridian5060
 
0.4%
Thomas4634
 
0.4%
Conway4613
 
0.4%
Cleveland4604
 
0.4%
Warren4599
 
0.4%
Houston4168
 
0.3%
Other values (884)1248070
96.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
city21314
 
1.3%
west19473
 
1.2%
north14425
 
0.9%
saint14363
 
0.9%
falls12794
 
0.8%
new11842
 
0.7%
mount11375
 
0.7%
lake11249
 
0.7%
san10260
 
0.6%
springs8727
 
0.5%
Other values (918)1482445
91.6%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

state
Categorical

HIGH CARDINALITY
HIGH CORRELATION

Distinct51
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size9.9 MiB
TX
94876 
NY
 
83501
PA
 
79847
CA
 
56360
OH
 
46480
Other values (46)
935611 

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNC
2nd rowWA
3rd rowID
4th rowMT
5th rowVA

Common Values

ValueCountFrequency (%)
TX94876
 
7.3%
NY83501
 
6.4%
PA79847
 
6.2%
CA56360
 
4.3%
OH46480
 
3.6%
MI46154
 
3.6%
IL43252
 
3.3%
FL42671
 
3.3%
AL40989
 
3.2%
MO38403
 
3.0%
Other values (41)724142
55.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
tx94876
 
7.3%
ny83501
 
6.4%
pa79847
 
6.2%
ca56360
 
4.3%
oh46480
 
3.6%
mi46154
 
3.6%
il43252
 
3.3%
fl42671
 
3.3%
al40989
 
3.2%
mo38403
 
3.0%
Other values (41)724142
55.8%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

zip
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct970
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean48800.6711
Minimum1257
Maximum99783
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size9.9 MiB

Quantile statistics

Minimum: 1257
5-th percentile: 7208
Q1: 26237
median: 48174
Q3: 72042
95-th percentile: 94569
Maximum: 99783
Range: 98526
Interquartile range (IQR): 45805

Descriptive statistics

Standard deviation: 26893.22248
Coefficient of variation (CV): 0.551083046
Kurtosis: -1.096449332
Mean: 48800.6711
Median Absolute Deviation (MAD): 23068
Skewness: 0.07968075775
Sum: 6.32786102 × 10^10
Variance: 723245415.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                   Count      Frequency (%)
73754                   3646       0.3%
34112                   3613       0.3%
48088                   3597       0.3%
82514                   3527       0.3%
49628                   3123       0.2%
15484                   3123       0.2%
85173                   3119       0.2%
29819                   3117       0.2%
38761                   3113       0.2%
5461                    3112       0.2%
Other values (960)      1263585    97.4%

Value                   Count      Frequency (%)
1257                    2023       0.2%
1330                    1031       0.1%
1535                    515        < 0.1%
1545                    1024       0.1%
1612                    519        < 0.1%
1843                    2597       0.2%
1844                    2058       0.2%
2180                    519        < 0.1%
2630                    2090       0.2%
2908                    550        < 0.1%

Value                   Count      Frequency (%)
99783                   1568       0.1%
99747                   12         < 0.1%
99746                   540        < 0.1%
99323                   2572       0.2%
99160                   3030       0.2%
99116                   15         < 0.1%
99113                   1047       0.1%
99033                   2458       0.2%
98836                   524        < 0.1%
98665                   500        < 0.1%

lat
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct968
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean38.53762161
Minimum20.0271
Maximum66.6933
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size9.9 MiB

Quantile statistics

Minimum: 20.0271
5-th percentile: 29.8826
Q1: 34.6205
median: 39.3543
Q3: 41.9404
95-th percentile: 45.8433
Maximum: 66.6933
Range: 46.6662
Interquartile range (IQR): 7.3199

Descriptive statistics

Standard deviation5.075808439
Coefficient of variation (CV)0.1317104748
Kurtosis0.8129679455
Mean38.53762161
Median Absolute Deviation (MAD)3.3597
Skewness-0.1860276801
Sum49970770.51
Variance25.76383131
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value                   Count      Frequency (%)
36.385                  3646       0.3%
26.1184                 3613       0.3%
42.5164                 3597       0.3%
43.0048                 3527       0.3%
44.5995                 3123       0.2%
39.8936                 3123       0.2%
33.2887                 3119       0.2%
34.0326                 3117       0.2%
33.4783                 3113       0.2%
44.3346                 3112       0.2%
Other values (958)      1263585    97.4%

Value                   Count      Frequency (%)
20.0271                 1527       0.1%
20.0827                 1032       0.1%
24.6557                 2584       0.2%
26.1184                 3613       0.3%
26.3304                 542        < 0.1%
26.3771                 518        < 0.1%
26.4215                 3038       0.2%
26.4722                 2524       0.2%
26.529                  1549       0.1%
26.6939                 1027       0.1%

Value                   Count      Frequency (%)
66.6933                 12         < 0.1%
65.6899                 540        < 0.1%
64.7556                 1568       0.1%
48.8878                 3030       0.2%
48.8856                 2066       0.2%
48.8328                 1533       0.1%
48.6669                 1047       0.1%
48.6031                 2973       0.2%
48.4786                 2038       0.2%
48.34                   3088       0.2%

long
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct969
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-90.22633538
Minimum-165.6723
Maximum-67.9503
Zeros0
Zeros (%)0.0%
Negative1296675
Negative (%)100.0%
Memory size9.9 MiB

Quantile statistics

Minimum: -165.6723
5-th percentile: -119.0825
Q1: -96.798
median: -87.4769
Q3: -80.158
95-th percentile: -73.5112
Maximum: -67.9503
Range: 97.722
Interquartile range (IQR): 16.64

Descriptive statistics

Standard deviation13.75907695
Coefficient of variation (CV)-0.1524951323
Kurtosis1.855892285
Mean-90.22633538
Median Absolute Deviation (MAD)8.1527
Skewness-1.150107737
Sum-116994233.4
Variance189.3121984
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value                   Count      Frequency (%)
-98.0727                3646       0.3%
-81.7361                3613       0.3%
-82.9832                3597       0.3%
-108.8964               3527       0.3%
-79.7856                3123       0.2%
-86.2141                3123       0.2%
-111.0985               3119       0.2%
-82.2027                3117       0.2%
-90.5142                3113       0.2%
-73.098                 3112       0.2%
Other values (959)      1263585    97.4%

Value                   Count      Frequency (%)
-165.6723               1568       0.1%
-156.292                540        < 0.1%
-155.488                1032       0.1%
-155.3697               1527       0.1%
-153.994                12         < 0.1%
-124.4409               1043       0.1%
-124.2174               1547       0.1%
-124.1587               1031       0.1%
-124.1437               1526       0.1%
-123.9743               2036       0.2%

Value                   Count      Frequency (%)
-67.9503                2080       0.2%
-68.5565                1014       0.1%
-69.2675                519        < 0.1%
-69.4828                2050       0.2%
-69.9576                537        < 0.1%
-69.9656                3107       0.2%
-70.1031                9          < 0.1%
-70.239                 1036       0.1%
-70.3001                2090       0.2%
-70.3457                1527       0.1%

city_pop
Real number (ℝ≥0)

HIGH CORRELATION

Distinct879
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean88824.44056
Minimum23
Maximum2906700
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size9.9 MiB

Quantile statistics

Minimum: 23
5-th percentile: 139
Q1: 743
median: 2456
Q3: 20328
95-th percentile: 525713
Maximum: 2906700
Range: 2906677
Interquartile range (IQR): 19585

Descriptive statistics

Standard deviation: 301956.3607
Coefficient of variation (CV): 3.399473825
Kurtosis: 37.6145193
Mean: 88824.44056
Median Absolute Deviation (MAD): 2198
Skewness: 5.593853067
Sum: 1.151764315 × 10^11
Variance: 9.117764376 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                   Count      Frequency (%)
606                     5496       0.4%
1595797                 5130       0.4%
1312922                 5075       0.4%
1766                    4574       0.4%
241                     4533       0.3%
2906700                 4168       0.3%
276002                  4155       0.3%
302                     4147       0.3%
910148                  4073       0.3%
198                     4067       0.3%
Other values (869)      1251257    96.5%

Value                   Count      Frequency (%)
23                      2049       0.2%
37                      1013       0.1%
43                      2034       0.2%
46                      3040       0.2%
47                      511        < 0.1%
49                      1054       0.1%
51                      1016       0.1%
52                      518        < 0.1%
53                      2610       0.2%
60                      1045       0.1%

Value                   Count      Frequency (%)
2906700                 4168       0.3%
2504700                 2033       0.2%
2383912                 521        < 0.1%
1595797                 5130       0.4%
1577385                 2563       0.2%
1526206                 3517       0.3%
1417793                 8          < 0.1%
1382480                 2056       0.2%
1312922                 5075       0.4%
1263321                 3629       0.3%

job
Categorical

HIGH CARDINALITY

Distinct494
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size9.9 MiB
Film/video editor
 
9779
Exhibition designer
 
9199
Naval architect
 
8684
Surveyor, land/geomatics
 
8680
Materials engineer
 
8270
Other values (489)
1252063 

Length

Max length59
Median length19
Mean length20.2271024
Min length3

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPsychologist, counselling
2nd rowSpecial educational needs teacher
3rd rowNature conservation officer
4th rowPatent attorney
5th rowDance movement psychotherapist

Common Values

ValueCountFrequency (%)
Film/video editor9779
 
0.8%
Exhibition designer9199
 
0.7%
Naval architect8684
 
0.7%
Surveyor, land/geomatics8680
 
0.7%
Materials engineer8270
 
0.6%
Designer, ceramics/pottery8225
 
0.6%
Systems developer7700
 
0.6%
IT trainer7679
 
0.6%
Financial adviser7659
 
0.6%
Environmental consultant7547
 
0.6%
Other values (484)1213253
93.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
engineer131756
 
4.6%
officer110915
 
3.9%
manager61124
 
2.1%
scientist55878
 
1.9%
designer52218
 
1.8%
surveyor49062
 
1.7%
teacher38126
 
1.3%
psychologist32600
 
1.1%
research29754
 
1.0%
editor28725
 
1.0%
Other values (456)2289024
79.5%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

dob
Categorical

HIGH CARDINALITY

Distinct968
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size9.9 MiB
1977-03-23
 
5636
1981-08-29
 
4636
1988-09-15
 
4623
1955-05-06
 
3661
1995-07-12
 
3123
Other values (963)
1274996 

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1988-03-09
2nd row1978-06-21
3rd row1962-01-19
4th row1967-01-12
5th row1986-03-28

Common Values

Value                   Count      Frequency (%)
1977-03-23              5636       0.4%
1981-08-29              4636       0.4%
1988-09-15              4623       0.4%
1955-05-06              3661       0.3%
1995-07-12              3123       0.2%
1983-07-25              3123       0.2%
1987-10-28              3119       0.2%
1984-06-03              3117       0.2%
1999-03-05              3113       0.2%
1998-03-19              3112       0.2%
Other values (958)      1259412    97.1%

Length

Histogram of lengths of the category
Value                   Count      Frequency (%)
1977-03-23              5636       0.4%
1981-08-29              4636       0.4%
1988-09-15              4623       0.4%
1955-05-06              3661       0.3%
1995-07-12              3123       0.2%
1983-07-25              3123       0.2%
1987-10-28              3119       0.2%
1984-06-03              3117       0.2%
1999-03-05              3113       0.2%
1998-03-19              3112       0.2%
Other values (958)      1259412    97.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

trans_num
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct1296675
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size9.9 MiB
Value | Count
32e4534ec328b0dc06e915376ac45f66 | 1
0c0598ad26b0a46ef55af16eb1f644ad | 1
d14057dd4916c3020246fc38ef88bc1e | 1
00b0e841d9d663c50800a3d8a58d89bd | 1
162f4d53cd0f7ec22f8aca076ffafd7f | 1
Other values (1296670) | 1296670

Length

Max length32
Median length32
Mean length32
Min length32

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0
Distinct scripts0
Distinct blocks0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1296675
Unique (%)100.0%

Sample

1st row0b242abb623afc578575680df30655b9
2nd row1f76529f8574734946361c461b024d99
3rd rowa1a22d70485983eac12b5b88dad1cf95
4th row6b849c168bdad6f867558c3793159a81
5th rowa41d7549acf90789359a9aa5346dcb46

Common Values

Value | Count | Frequency (%)
32e4534ec328b0dc06e915376ac45f66 | 1 | < 0.1%
0c0598ad26b0a46ef55af16eb1f644ad | 1 | < 0.1%
d14057dd4916c3020246fc38ef88bc1e | 1 | < 0.1%
00b0e841d9d663c50800a3d8a58d89bd | 1 | < 0.1%
162f4d53cd0f7ec22f8aca076ffafd7f | 1 | < 0.1%
f8782fa5f2053f9a77fd8ecfbe7fa9cf | 1 | < 0.1%
98af7a381e33ad6802cdb2f1c92e3675 | 1 | < 0.1%
0e2fef0cf6ff9150cb15508adc002f26 | 1 | < 0.1%
98b518878be5dd03e12495f5b701ee11 | 1 | < 0.1%
b4a11538e13cb33e5c2f4dcf6c10cf7c | 1 | < 0.1%
Other values (1296665) | 1296665 | > 99.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
f7eff44f07d5da2fcaffb88a9ecb97d3 | 1 | < 0.1%
58655490d3993406721066dcd484eaf8 | 1 | < 0.1%
e5fc727fed2d17134bd12ddac6051d55 | 1 | < 0.1%
4f7842851f43cb611813026ea3af731f | 1 | < 0.1%
b9707bb1ccd11bfd1a30d1e6d67a9c03 | 1 | < 0.1%
5787bd42842877f928d2e57a33e4512a | 1 | < 0.1%
c7dc28f67636b7421cf50662346bc7ca | 1 | < 0.1%
3dde4f84a3cc21ce78f425b8bcd9ff34 | 1 | < 0.1%
1778f7152cada79befdfb42b8b211287 | 1 | < 0.1%
22efdcfff6586eae277a75a12b204c0a | 1 | < 0.1%
Other values (1296665) | 1296665 | > 99.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

unix_time
Real number (ℝ≥0)

Distinct1274823
Distinct (%)98.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1349243637
Minimum1325376018
Maximum1371816817
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size9.9 MiB

Quantile statistics

Minimum1325376018
5-th percentile1328671975
Q11338750742
median1349249747
Q31359385376
95-th percentile1369830595
Maximum1371816817
Range46440799
Interquartile range (IQR)20634633

Descriptive statistics

Standard deviation12841278.42
Coefficient of variation (CV)0.009517390391
Kurtosis-1.087540501
Mean1349243637
Median Absolute Deviation (MAD)10358807
Skewness0.003377949757
Sum1.749530493 × 10^15
Variance1.648984315 × 10^14
MonotonicityIncreasing
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1335110521 | 4 | < 0.1%
1370177227 | 4 | < 0.1%
1370050667 | 4 | < 0.1%
1356144337 | 3 | < 0.1%
1334432745 | 3 | < 0.1%
1354630600 | 3 | < 0.1%
1349404036 | 3 | < 0.1%
1338044031 | 3 | < 0.1%
1334966271 | 3 | < 0.1%
1345129731 | 3 | < 0.1%
Other values (1274813) | 1296642 | > 99.9%

Value | Count | Frequency (%)
1325376018 | 1 | < 0.1%
1325376044 | 1 | < 0.1%
1325376051 | 1 | < 0.1%
1325376076 | 1 | < 0.1%
1325376186 | 1 | < 0.1%
1325376248 | 1 | < 0.1%
1325376282 | 1 | < 0.1%
1325376308 | 1 | < 0.1%
1325376318 | 1 | < 0.1%
1325376361 | 1 | < 0.1%

Value | Count | Frequency (%)
1371816817 | 1 | < 0.1%
1371816816 | 1 | < 0.1%
1371816752 | 1 | < 0.1%
1371816739 | 1 | < 0.1%
1371816728 | 1 | < 0.1%
1371816696 | 1 | < 0.1%
1371816683 | 1 | < 0.1%
1371816656 | 1 | < 0.1%
1371816562 | 1 | < 0.1%
1371816522 | 1 | < 0.1%

merch_lat
Real number (ℝ≥0)

HIGH CORRELATION

Distinct1247805
Distinct (%)96.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean38.53733804
Minimum19.027785
Maximum67.510267
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size9.9 MiB

Quantile statistics

Minimum19.027785
5-th percentile29.7516534
Q134.733572
median39.36568
Q341.957164
95-th percentile46.0035301
Maximum67.510267
Range48.482482
Interquartile range (IQR)7.223592

Descriptive statistics

Standard deviation5.10978837
Coefficient of variation (CV)0.1325931844
Kurtosis0.79599391
Mean38.53733804
Median Absolute Deviation (MAD)3.397536
Skewness-0.1819154297
Sum49970402.81
Variance26.10993718
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
34.048439 | 4 | < 0.1%
42.749184 | 4 | < 0.1%
41.568128 | 4 | < 0.1%
40.772096 | 4 | < 0.1%
41.632488 | 4 | < 0.1%
41.910192 | 4 | < 0.1%
39.845849 | 4 | < 0.1%
40.550199 | 4 | < 0.1%
37.669788 | 4 | < 0.1%
40.277086 | 4 | < 0.1%
Other values (1247795) | 1296635 | > 99.9%

Value | Count | Frequency (%)
19.027785 | 1 | < 0.1%
19.027804 | 1 | < 0.1%
19.029798 | 1 | < 0.1%
19.031242 | 1 | < 0.1%
19.032277 | 1 | < 0.1%
19.033288 | 1 | < 0.1%
19.034282 | 1 | < 0.1%
19.034687 | 1 | < 0.1%
19.035472 | 1 | < 0.1%
19.036312 | 1 | < 0.1%

Value | Count | Frequency (%)
67.510267 | 1 | < 0.1%
67.441518 | 1 | < 0.1%
67.397018 | 1 | < 0.1%
67.188111 | 1 | < 0.1%
67.064277 | 1 | < 0.1%
66.835174 | 1 | < 0.1%
66.682905 | 1 | < 0.1%
66.67355 | 1 | < 0.1%
66.664673 | 1 | < 0.1%
66.659242 | 1 | < 0.1%

merch_long
Real number (ℝ)

HIGH CORRELATION

Distinct1275745
Distinct (%)98.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-90.2264648
Minimum-166.671242
Maximum-66.950902
Zeros0
Zeros (%)0.0%
Negative1296675
Negative (%)100.0%
Memory size9.9 MiB

Quantile statistics

Minimum-166.671242
5-th percentile-119.3300916
Q1-96.8972755
median-87.438392
Q3-80.2367965
95-th percentile-73.3542179
Maximum-66.950902
Range99.72034
Interquartile range (IQR)16.660479

Descriptive statistics

Standard deviation13.77109056
Coefficient of variation (CV)-0.1526280631
Kurtosis1.848479176
Mean-90.2264648
Median Absolute Deviation (MAD)8.227889
Skewness-1.146959945
Sum-116994401.2
Variance189.6429353
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-87.116414 | 4 | < 0.1%
-74.618269 | 4 | < 0.1%
-81.219189 | 4 | < 0.1%
-86.763294 | 3 | < 0.1%
-80.575864 | 3 | < 0.1%
-81.210015 | 3 | < 0.1%
-73.634879 | 3 | < 0.1%
-82.283919 | 3 | < 0.1%
-81.458097 | 3 | < 0.1%
-80.900899 | 3 | < 0.1%
Other values (1275735) | 1296642 | > 99.9%

Value | Count | Frequency (%)
-166.671242 | 1 | < 0.1%
-166.670132 | 1 | < 0.1%
-166.669638 | 1 | < 0.1%
-166.666179 | 1 | < 0.1%
-166.664828 | 1 | < 0.1%
-166.662888 | 1 | < 0.1%
-166.661968 | 1 | < 0.1%
-166.659277 | 1 | < 0.1%
-166.657834 | 1 | < 0.1%
-166.657174 | 1 | < 0.1%

Value | Count | Frequency (%)
-66.950902 | 1 | < 0.1%
-66.955996 | 1 | < 0.1%
-66.95654 | 1 | < 0.1%
-66.958659 | 1 | < 0.1%
-66.958751 | 1 | < 0.1%
-66.959178 | 1 | < 0.1%
-66.961923 | 1 | < 0.1%
-66.962913 | 1 | < 0.1%
-66.963918 | 1 | < 0.1%
-66.963975 | 1 | < 0.1%

is_fraud
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size9.9 MiB
Value | Count
0 | 1289169
1 | 7506

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0
Distinct scripts0
Distinct blocks0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value | Count | Frequency (%)
0 | 1289169 | 99.4%
1 | 7506 | 0.6%

Length

Histogram of lengths of the category

Pie chart


Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Interactions


Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
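
As a minimal sketch of the calculation just described (not the report's own implementation), the following pure-Python code ranks each variable, giving tied values the average of their positions, and then divides the covariance of the ranks by the product of their standard deviations. The function names `average_ranks` and `spearman_rho` and the toy data are illustrative assumptions.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their
    standard deviations. Assumes neither input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    # The 1/n factors in the covariance and standard deviations cancel.
    return cov / (sx * sy)


# Toy data (hypothetical): y is a nonlinear but strictly increasing function of x,
# so the ranks agree exactly and rho is 1 up to floating-point rounding.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
```

Because only the ranks enter the formula, any strictly increasing transformation of either variable leaves ρ unchanged.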

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
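
The calculation above can be sketched in a few lines of pure Python. This is an illustrative sketch rather than the report's implementation; the function name `pearson_r` is an assumption, and the input values are the `lat` and `merch_lat` entries of the first five rows shown in the Sample section. The second call checks the invariance under changes of location and scale mentioned above, using positive scale factors.

```python
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.
    Assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/n factors in the covariance and standard deviations cancel.
    return cov / (sx * sy)


# lat and merch_lat from the first five sample rows of the report.
lat = [36.0788, 48.8878, 42.1808, 46.2306, 38.4207]
merch_lat = [36.011293, 49.159047, 43.150704, 47.034331, 38.674999]

r = pearson_r(lat, merch_lat)

# Shifting and (positively) rescaling each variable separately leaves r unchanged.
r_transformed = pearson_r(
    [2.0 * v + 10.0 for v in lat],
    [0.5 * v - 3.0 for v in merch_lat],
)
```

Five rows are far too few to reproduce the report's coefficients; the snippet only illustrates the arithmetic.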

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
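
The pair-counting definition above can be sketched directly. The formula as stated corresponds to the variant usually called τ-a; library implementations often use a tie-adjusted variant instead, and which one the report used is not stated here. The function name `kendall_tau_a` and the toy data are illustrative assumptions, and the quadratic loop is only practical for small inputs.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs.
    Pairs tied in x or y count as neither concordant nor discordant."""
    concordant = 0
    discordant = 0
    total = 0
    for i, j in combinations(range(len(x)), 2):
        total += 1
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1  # both variables order the pair the same way
        elif direction < 0:
            discordant += 1  # the variables order the pair in opposite ways
    return (concordant - discordant) / total


# Toy data (hypothetical): of the 6 pairs, 5 are concordant and 1 is discordant,
# so tau = (5 - 1) / 6.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
```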

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
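
A hedged sketch of a bias-corrected Cramér's V follows. It uses the correction as it is commonly stated: φ² is reduced by (k-1)(r-1)/(n-1) and floored at 0, and the numbers of rows r and columns k are each reduced by (r-1)²/(n-1) and (k-1)²/(n-1). Whether this matches the report's code line for line is an assumption; the function name `cramers_v_corrected` and the toy data are hypothetical.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equal-length sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))   # observed cell counts
    rows = Counter(a)            # marginal counts of a
    cols = Counter(b)            # marginal counts of b

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row_label, row_count in rows.items():
        for col_label, col_count in cols.items():
            expected = row_count * col_count / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(rows), len(cols)
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Toy data (hypothetical): b is fully determined by a, so V is 1 up to rounding.
a = ["x", "y"] * 50
b = ["p", "q"] * 50
v_perfect = cramers_v_corrected(a, b)
```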

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.

Sample

First rows

index | trans_date_trans_time | cc_num | merchant | category | amt | first | last | gender | street | city | state | zip | lat | long | city_pop | job | dob | trans_num | unix_time | merch_lat | merch_long | is_fraud
0 | 2019-01-01 00:00:18 | 2703186189652095 | fraud_Rippin, Kub and Mann | misc_net | 4.97 | Jennifer | Banks | F | 561 Perry Cove | Moravian Falls | NC | 28654 | 36.0788 | -81.1781 | 3495 | Psychologist, counselling | 1988-03-09 | 0b242abb623afc578575680df30655b9 | 1325376018 | 36.011293 | -82.048315 | 0
1 | 2019-01-01 00:00:44 | 630423337322 | fraud_Heller, Gutmann and Zieme | grocery_pos | 107.23 | Stephanie | Gill | F | 43039 Riley Greens Suite 393 | Orient | WA | 99160 | 48.8878 | -118.2105 | 149 | Special educational needs teacher | 1978-06-21 | 1f76529f8574734946361c461b024d99 | 1325376044 | 49.159047 | -118.186462 | 0
2 | 2019-01-01 00:00:51 | 38859492057661 | fraud_Lind-Buckridge | entertainment | 220.11 | Edward | Sanchez | M | 594 White Dale Suite 530 | Malad City | ID | 83252 | 42.1808 | -112.2620 | 4154 | Nature conservation officer | 1962-01-19 | a1a22d70485983eac12b5b88dad1cf95 | 1325376051 | 43.150704 | -112.154481 | 0
3 | 2019-01-01 00:01:16 | 3534093764340240 | fraud_Kutch, Hermiston and Farrell | gas_transport | 45.00 | Jeremy | White | M | 9443 Cynthia Court Apt. 038 | Boulder | MT | 59632 | 46.2306 | -112.1138 | 1939 | Patent attorney | 1967-01-12 | 6b849c168bdad6f867558c3793159a81 | 1325376076 | 47.034331 | -112.561071 | 0
4 | 2019-01-01 00:03:06 | 375534208663984 | fraud_Keeling-Crist | misc_pos | 41.96 | Tyler | Garcia | M | 408 Bradley Rest | Doe Hill | VA | 24433 | 38.4207 | -79.4629 | 99 | Dance movement psychotherapist | 1986-03-28 | a41d7549acf90789359a9aa5346dcb46 | 1325376186 | 38.674999 | -78.632459 | 0
5 | 2019-01-01 00:04:08 | 4767265376804500 | fraud_Stroman, Hudson and Erdman | gas_transport | 94.63 | Jennifer | Conner | F | 4655 David Island | Dublin | PA | 18917 | 40.3750 | -75.2045 | 2158 | Transport planner | 1961-06-19 | 189a841a0a8ba03058526bcfe566aab5 | 1325376248 | 40.653382 | -76.152667 | 0
6 | 2019-01-01 00:04:42 | 30074693890476 | fraud_Rowe-Vandervort | grocery_net | 44.54 | Kelsey | Richards | F | 889 Sarah Station Suite 624 | Holcomb | KS | 67851 | 37.9931 | -100.9893 | 2691 | Arboriculturist | 1993-08-16 | 83ec1cc84142af6e2acf10c44949e720 | 1325376282 | 37.162705 | -100.153370 | 0
7 | 2019-01-01 00:05:08 | 6011360759745864 | fraud_Corwin-Collins | gas_transport | 71.65 | Steven | Williams | M | 231 Flores Pass Suite 720 | Edinburg | VA | 22824 | 38.8432 | -78.6003 | 6018 | Designer, multimedia | 1947-08-21 | 6d294ed2cc447d2c71c7171a3d54967c | 1325376308 | 38.948089 | -78.540296 | 0
8 | 2019-01-01 00:05:18 | 4922710831011201 | fraud_Herzog Ltd | misc_pos | 4.27 | Heather | Chase | F | 6888 Hicks Stream Suite 954 | Manor | PA | 15665 | 40.3359 | -79.6607 | 1472 | Public affairs consultant | 1941-03-07 | fc28024ce480f8ef21a32d64c93a29f5 | 1325376318 | 40.351813 | -79.958146 | 0
9 | 2019-01-01 00:06:01 | 2720830304681674 | fraud_Schoen, Kuphal and Nitzsche | grocery_pos | 198.39 | Melissa | Aguilar | F | 21326 Taylor Squares Suite 708 | Clarksville | TN | 37040 | 36.5220 | -87.3490 | 151785 | Pathologist | 1974-03-28 | 3b9014ea8fb80bd65de0b1463b00b00e | 1325376361 | 37.179198 | -87.485381 | 0

Last rows

index | trans_date_trans_time | cc_num | merchant | category | amt | first | last | gender | street | city | state | zip | lat | long | city_pop | job | dob | trans_num | unix_time | merch_lat | merch_long | is_fraud
1296665 | 2020-06-21 12:08:42 | 213193596103206 | fraud_Gulgowski LLC | home | 72.17 | James | Hunt | M | 7369 Gabriel Tunnel | Pointe Aux Pins | MI | 49775 | 45.7549 | -84.4470 | 95 | Electrical engineer | 1994-02-09 | 108c103b26f686c24c021aaf4210977e | 1371816522 | 44.938461 | -83.996234 | 0
1296666 | 2020-06-21 12:09:22 | 4587657402165341815 | fraud_Hyatt, Russel and Gleichner | health_fitness | 7.30 | Amber | Lewis | F | 6296 John Keys Suite 858 | Pembroke Township | IL | 60958 | 41.0646 | -87.5917 | 2135 | Psychotherapist, child | 2004-05-08 | 37a18c6fb0c5c722b6339ffedc82f55a | 1371816562 | 40.556811 | -88.092339 | 0
1296667 | 2020-06-21 12:10:56 | 4822367783500458 | fraud_Hahn, Douglas and Schowalter | travel | 19.71 | Christopher | Farrell | M | 97070 Anderson Land | Haines City | FL | 33844 | 28.0758 | -81.5929 | 33804 | Exercise physiologist | 1991-01-01 | 34e72e0a659a6c8f4a20ee65594f3a7d | 1371816656 | 27.465871 | -81.511804 | 0
1296668 | 2020-06-21 12:11:23 | 213141712584544 | fraud_Metz, Russel and Metz | kids_pets | 100.85 | Margaret | Curtis | F | 742 Oneill Shore | Florence | MS | 39073 | 32.1530 | -90.1217 | 19685 | Fine artist | 1984-12-24 | 0d86d8c17638d7eff77db9c6a878b477 | 1371816683 | 31.377697 | -90.528450 | 0
1296669 | 2020-06-21 12:11:36 | 4400011257587661852 | fraud_Stiedemann Inc | misc_pos | 37.38 | Marissa | Powell | F | 474 Allen Haven | North Loup | NE | 68859 | 41.4972 | -98.7858 | 509 | Nurse, children's | 1980-09-15 | 9a7ea2625cf8303efe34e3c09546868f | 1371816696 | 41.728638 | -99.039660 | 0
1296670 | 2020-06-21 12:12:08 | 30263540414123 | fraud_Reichel Inc | entertainment | 15.56 | Erik | Patterson | M | 162 Jessica Row Apt. 072 | Hatch | UT | 84735 | 37.7175 | -112.4777 | 258 | Geoscientist | 1961-11-24 | 440b587732da4dc1a6395aba5fb41669 | 1371816728 | 36.841266 | -111.690765 | 0
1296671 | 2020-06-21 12:12:19 | 6011149206456997 | fraud_Abernathy and Sons | food_dining | 51.70 | Jeffrey | White | M | 8617 Holmes Terrace Suite 651 | Tuscarora | MD | 21790 | 39.2667 | -77.5101 | 100 | Production assistant, television | 1979-12-11 | 278000d2e0d2277d1de2f890067dcc0a | 1371816739 | 38.906881 | -78.246528 | 0
1296672 | 2020-06-21 12:12:32 | 3514865930894695 | fraud_Stiedemann Ltd | food_dining | 105.93 | Christopher | Castaneda | M | 1632 Cohen Drive Suite 639 | High Rolls Mountain Park | NM | 88325 | 32.9396 | -105.8189 | 899 | Naval architect | 1967-08-30 | 483f52fe67fabef353d552c1e662974c | 1371816752 | 33.619513 | -105.130529 | 0
1296673 | 2020-06-21 12:13:36 | 2720012583106919 | fraud_Reinger, Weissnat and Strosin | food_dining | 74.90 | Joseph | Murray | M | 42933 Ryan Underpass | Manderson | SD | 57756 | 43.3526 | -102.5411 | 1126 | Volunteer coordinator | 1980-08-18 | d667cdcbadaaed3da3f4020e83591c83 | 1371816816 | 42.788940 | -103.241160 | 0
1296674 | 2020-06-21 12:13:37 | 4292902571056973207 | fraud_Langosh, Wintheiser and Hyatt | food_dining | 4.30 | Jeffrey | Smith | M | 135 Joseph Mountains | Sula | MT | 59871 | 45.8433 | -113.8748 | 218 | Therapist, horticultural | 1995-08-16 | 8f7c8e4ab7f25875d753b422917c98c9 | 1371816817 | 46.565983 | -114.186110 | 0